promote kep-3673 to GA #5640
/cc @ruiwen-zhao
@pacoxu: GitHub didn't allow me to request PR reviews from the following users: ruiwen-zhao. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/lgtm
/approve
#### GA

- Change the default value of `serialize-image-pulls` to false and set the default value of `maxParallelImagePulls` to 2.
Hmm - do we really want to change the default for "GA"?
With https://github.com/kubernetes/enhancements/tree/master/keps/sig-architecture/5241-beta-featuregate-promotion-requirements, we generally want GA to be effectively a "no-op". Changing the default might be a bit unexpected here.
I know that it doesn't explicitly affect users (it may affect them implicitly, because some pod startups may become slower or faster due to image pulling), but it still might not be intuitive.
Let me ping other PRR approvers about it for their thoughts.
If so, should we make this step a beta-2 for this KEP in order to change the default?
So I thought about it, and we discussed it a bit with other PRR approvers. Based on that discussion I have a few questions before we make a final decision:
- My understanding was that before this KEP, the default behavior was "unbounded parallel image pulls". But looking into this proposal I actually see this sentence: "Before this proposal, serialize-image-pulls is by default true", which suggests that this isn't the case. So either my understanding was incorrect, or this sentence is not true, or I don't understand the semantics of serialize-image-pulls, or something else. Can you please clarify what the current (before this proposal) default semantics are?
- If I was indeed wrong and we were serializing image pulls by default, then switching the default seems like a "no-go" (no matter whether it would be GA, another beta, or anything else). That just seems like a potentially breaking change for users.
Serialized pull was the default. We are making the default marginally better by allowing 2 parallel pulls to improve reliability (one bad image will not break the whole node).
This is not strictly required. Many installations override this default anyway, as it is not reliable. It is just nice to have for somebody who is only trying it out.
If PRR is blocked on this, let's just keep the old defaults and forget about it.
1. Previous default behavior: serialized pulls. One image pull may block a node from pulling new images for new pods.
2. Previous behavior with serialize-image-pulls=false and maxParallelImagePulls unset: unlimited parallel pulls, even when registryPullQPS and registryBurst are set (see https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/3673-kubelet-parallel-image-pull-limit/README.md#qpsburst-limits-on-kubelet-are-confusing). Unlimited parallel pulls may put too much IO pressure on the disk.
3. After this default value change: we want to allow 2 parallel pulls to improve reliability (one bad image will not break the whole node), as Sergey said.
I think 1 and 2 are not ideal default behaviors and 3 would be a better default. This is a breaking change to some extent, but an acceptable one.
PS. A value of 2 is a very conservative and cautious choice.
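To make the three behaviors concrete, here is a minimal sketch of a bounded puller built on a counting semaphore. It is not the kubelet's implementation: `pullAll` and its `limit` parameter are hypothetical names, and the sleep is only a stand-in for a real image pull. A limit of 1 models serialized pulls, a limit of 2 models the proposed cap, and a non-positive limit models the old unlimited behavior.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// pullAll simulates pulling every image with at most limit pulls in flight
// and returns the peak concurrency it observed. limit == 1 models serialized
// pulls; limit <= 0 models the old "unlimited" behavior.
// Hypothetical helper for illustration only, not kubelet code.
func pullAll(images []string, limit int) int32 {
	if limit <= 0 {
		limit = len(images)
	}
	sem := make(chan struct{}, limit) // counting semaphore
	var inFlight, peak int32
	var wg sync.WaitGroup
	for range images {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{}        // wait for a free pull slot
			defer func() { <-sem }() // release the slot when done
			n := atomic.AddInt32(&inFlight, 1)
			for { // record a new peak if this pull raised it
				p := atomic.LoadInt32(&peak)
				if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
					break
				}
			}
			time.Sleep(5 * time.Millisecond) // stand-in for the actual pull
			atomic.AddInt32(&inFlight, -1)
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	images := []string{"a", "b", "c", "d", "e", "f"}
	for _, limit := range []int{1, 2, 0} {
		fmt.Printf("limit=%d peak=%d\n", limit, pullAll(images, limit))
	}
}
```

With a limit of 1 the observed peak is always 1, so one slow or broken pull holds up every other image on the node. Any limit above 1 lets other pulls proceed while still bounding the load.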
So I would like to distinguish two things:
1. Changing the default from serialized to parallel.
2. Changing the default number of parallel image pulls, when serialize is false, from unbounded to 2.
In (1) we change the default behavior even for administrators who aren't aware of how image pulls work. I agree that it helps in some cases (a single pull blocking all other image pulls), but on the other hand it may negatively affect existing use cases (if I have two large pulls at the same time, it's actually better to download the first one and only then start the second, rather than having the first one take 2x longer).
So I don't think we should really change that default. The mitigation for administrators is to configure their setup with parallel pulls, and they can do that with this feature.
In (2), given that "serialize-image-pulls=false" is not the default, administrators who use it are already aware of it and are configuring it. I agree that "unbounded" is bad, and switching to maxParallelImagePulls=2 in that case (if someone didn't set it explicitly) seems helpful. I'm fine with this change, but the consensus among PRR approvers is that it would then have to be a second beta.
[And I would actually go with this option.]
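The "first one takes 2x longer" argument can be checked with a small back-of-the-envelope model. The sketch below assumes a single shared bandwidth bottleneck that is split evenly among active pulls, and it ignores layer reuse, unpacking, and registry throttling. The function names, image sizes, and bandwidth figure are all made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// serialFinish returns each image's finish time (seconds) when images are
// pulled one after another in the given order.
func serialFinish(sizesMB []float64, mbPerSec float64) []float64 {
	out := make([]float64, len(sizesMB))
	elapsed := 0.0
	for i, s := range sizesMB {
		elapsed += s / mbPerSec
		out[i] = elapsed
	}
	return out
}

// sharedFinish returns each image's finish time when all pulls start at once
// and the bandwidth is split evenly among the pulls that are still active.
func sharedFinish(sizesMB []float64, mbPerSec float64) []float64 {
	n := len(sizesMB)
	order := make([]int, n)
	for i := range order {
		order[i] = i
	}
	// Smaller images finish first under even sharing.
	sort.Slice(order, func(a, b int) bool { return sizesMB[order[a]] < sizesMB[order[b]] })
	out := make([]float64, n)
	elapsed, pulled := 0.0, 0.0 // pulled = MB already fetched by every active pull
	for k, i := range order {
		active := float64(n - k)
		elapsed += (sizesMB[i] - pulled) * active / mbPerSec
		pulled = sizesMB[i]
		out[i] = elapsed
	}
	return out
}

func main() {
	sizes := []float64{1000, 1000} // two large images, in MB
	fmt.Println("serial:", serialFinish(sizes, 100))
	fmt.Println("shared:", sharedFinish(sizes, 100))
}
```

Under these assumptions, two 1000 MB images at 100 MB/s finish at 10 s and 20 s when serialized, but both finish at 20 s when pulled in parallel: the second pod is no faster and the first one starts later. The model says nothing about the failure case that motivates parallel pulls, where one stuck pull blocks all others.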
I would prefer to just GA this KEP then. Dragging this out longer for a default configuration update that most environments need to fine-tune anyway doesn't sound attractive.
There seem to be 3 proposals:
1. Change both default values: serializeImagePulls: false and maxParallelImagePulls: 2.
2. Change the default number of parallel image pulls, when serialize is false, from unbounded to 2.
3. Do nothing and GA.
@wojtek-t +1 for 2, and @SergeyKanzhelev +1 for 3.
IIRC, we discussed whether we should set a default value for maxParallelImagePulls during KEP initialization.
    if obj.SerializeImagePulls == nil {
        // SerializeImagePulls is default to true when MaxParallelImagePulls
        // is not set, and false when MaxParallelImagePulls is set.
        // This is to save users from having to set both configs.
        if obj.MaxParallelImagePulls == nil || *obj.MaxParallelImagePulls < 2 {
            obj.SerializeImagePulls = ptr.To(true)
        } else {
            obj.SerializeImagePulls = ptr.To(false)
        }
    }
Currently, setting MaxParallelImagePulls=2 will enable parallel image pulling if SerializeImagePulls is not set.
So solution 2 may look something like the following.
      if obj.SerializeImagePulls == nil {
          // SerializeImagePulls is default to true when MaxParallelImagePulls
          // is not set, and false when MaxParallelImagePulls is set.
          // This is to save users from having to set both configs.
          if obj.MaxParallelImagePulls == nil || *obj.MaxParallelImagePulls < 2 {
              obj.SerializeImagePulls = ptr.To(true)
          } else {
              obj.SerializeImagePulls = ptr.To(false)
          }
    + } else if !*obj.SerializeImagePulls && obj.MaxParallelImagePulls == nil {
    +     obj.MaxParallelImagePulls = ptr.To[int32](2)
    + }
I think I would +1 the "do nothing and GA" option.
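For readers who want to see how the defaulting in this comment resolves each combination, here is a self-contained sketch. It mirrors the quoted logic but is not the real kubelet code: `kubeletConfig` is a trimmed stand-in for the actual configuration type, `ptrTo` stands in for `ptr.To` from `k8s.io/utils/ptr`, and the `capUnbounded` switch is a hypothetical flag that turns the extra branch from proposal 2 on or off.

```go
package main

import "fmt"

// kubeletConfig holds only the two fields discussed in this thread.
// It is a trimmed, hypothetical stand-in for the real configuration type.
type kubeletConfig struct {
	SerializeImagePulls   *bool
	MaxParallelImagePulls *int32
}

// ptrTo is a local stand-in for ptr.To from k8s.io/utils/ptr.
func ptrTo[T any](v T) *T { return &v }

// setDefaults applies the defaulting quoted above. When capUnbounded is true
// it also applies the extra branch sketched for proposal 2.
func setDefaults(obj *kubeletConfig, capUnbounded bool) {
	if obj.SerializeImagePulls == nil {
		// Serialize unless the user asked for at least 2 parallel pulls.
		if obj.MaxParallelImagePulls == nil || *obj.MaxParallelImagePulls < 2 {
			obj.SerializeImagePulls = ptrTo(true)
		} else {
			obj.SerializeImagePulls = ptrTo(false)
		}
	} else if capUnbounded && !*obj.SerializeImagePulls && obj.MaxParallelImagePulls == nil {
		// Proposal 2: explicit parallel pulls without a limit get a cap of 2.
		obj.MaxParallelImagePulls = ptrTo[int32](2)
	}
}

// describe renders the resolved settings; call it only after setDefaults.
func describe(c kubeletConfig) string {
	limit := "unset"
	if c.MaxParallelImagePulls != nil {
		limit = fmt.Sprint(*c.MaxParallelImagePulls)
	}
	return fmt.Sprintf("serialize=%t maxParallel=%s", *c.SerializeImagePulls, limit)
}

func main() {
	cases := []kubeletConfig{
		{}, // nothing set
		{MaxParallelImagePulls: ptrTo[int32](2)},
		{SerializeImagePulls: ptrTo(false)},
	}
	for _, capUnbounded := range []bool{false, true} {
		for _, c := range cases {
			setDefaults(&c, capUnbounded)
			fmt.Printf("capUnbounded=%t -> %s\n", capUnbounded, describe(c))
		}
	}
}
```

The only case where the two variants differ is an explicit `SerializeImagePulls=false` with no limit: it stays unbounded under the current defaulting and becomes 2 with the extra branch. An unset configuration stays serialized either way.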
Updated to do nothing and GA this KEP (adding the new maxParallelImagePulls configuration to the kubelet, without a feature gate).
> Currently, setting MaxParallelImagePulls=2 will enable parallel image pulling if SerializeImagePulls is not set.
That makes sense. And I think that helps with what we need.
I'm ok with GA-ing as is given the above.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: mrunalp, pacoxu, SergeyKanzhelev
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
One-line PR description: kep 3673 GA
Issue link: Kubelet limit of Parallel Image Pulls #3673
Other comments: for GA, we change default behavior to `serializeImagePulls: false` and `maxParallelImagePulls: 2`.